Conversation
…ompareExecArgs Routes ap.was_executed_with_args(...) through the new dynamicpathdetector.CompareExecArgs matcher so user-defined ApplicationProfile entries can express argument-vector patterns: '⋯' (DynamicIdentifier) — matches exactly one arg position. '*' (WildcardIdentifier) — matches zero or more consecutive args. Previously the function used slices.Compare for exact equality, which silently ignored wildcard tokens in the profile (a profile entry of [--user, ⋯] would not match runtime args [--user, alice]). The new matcher anchors both ends, treats '⋯' / '*' as wildcards, and falls back to literal-equality otherwise. 13 new unit subtests under TestExecWithArgsWildcardInProfile cover curl-with-any-URL, sh -c with arbitrary payload, ls -l with any directory, and echo with arbitrary trailing args, plus negative cases (literal-anchor mismatch, ⋯ over- and under-consumed, * mid-profile that fails to re-anchor on a literal). Storage dep bumped to k8sstormcenter/storage@3fc287210729 which adds the matcher (k8sstormcenter/storage#23). Note: default-rules.yaml R0001 'Unexpected process launched' currently uses path-only ap.was_executed. End-to-end exercise of the wildcard path requires either a rule update or a custom RuleSet — deferred to a follow-up so this PR stays scoped to the matcher wiring.
f4c1328 to
c20e29b
Compare
📝 WalkthroughWalkthroughUpdated storage dependency to a newer pseudo-version in ChangesWildcard Exec Argument Matching
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 9/10 reviews remaining, refill in 6 minutes. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/rulemanager/cel/libraries/applicationprofile/exec_test.go`:
- Around line 329-350: Add a positive+negative test pair that exercises
mid-profile "*" re-anchoring in the Execs test table: within the Execs slice
(type v1beta1.ExecCalls) add an entry where Args contains a literal token, then
"*" and then another literal token (e.g., Args: []string{"cmd", "*", "end"}) and
assert a match for a command whose args have arbitrary middle tokens but end
with the trailing literal (positive), and a non-match for a command where the
later literal is missing or not anchored (negative). Update the corresponding
expectations in the test (same table near the existing curl/sh/ls/echo entries
in exec_test.go) so the test validates that "*" can re-anchor to a later literal
token between anchors.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 469d529c-e5b1-49ab-b030-d6d5c0583d21
⛔ Files ignored due to path filters (1)
go.sumis excluded by!**/*.sum
📒 Files selected for processing (3)
go.modpkg/rulemanager/cel/libraries/applicationprofile/exec.gopkg/rulemanager/cel/libraries/applicationprofile/exec_test.go
| Execs: []v1beta1.ExecCalls{ | ||
| // curl any URL: --user must be literal, value is one position. | ||
| { | ||
| Path: "/usr/bin/curl", | ||
| Args: []string{"--user", "⋯"}, | ||
| }, | ||
| // sh -c with any trailing payload (zero or more args). | ||
| { | ||
| Path: "/bin/sh", | ||
| Args: []string{"-c", "*"}, | ||
| }, | ||
| // ls -l in any directory — single trailing position. | ||
| { | ||
| Path: "/bin/ls", | ||
| Args: []string{"-l", "⋯"}, | ||
| }, | ||
| // echo with any number of greeting words after a literal anchor. | ||
| { | ||
| Path: "/bin/echo", | ||
| Args: []string{"hello", "*"}, | ||
| }, | ||
| }, |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find profile arg patterns where "*" appears before a trailing literal anchor.
rg -nP 'Args:\s*\[]string\{[^}]*"\*"\s*,\s*"[^"]+"' pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 51
🏁 Script executed:
# Check if the file exists and get basic info
wc -l pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 135
🏁 Script executed:
# Read the relevant test sections mentioned in the review (lines 329-350 and 364-390)
sed -n '320,420p' pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 3418
🏁 Script executed:
# Find the implementation of the matcher to understand wildcard behavior
fd -i "exec" pkg/rulemanager/cel/libraries/applicationprofile/ -type f | grep -v test | head -5Repository: k8sstormcenter/node-agent
Length of output: 240
🏁 Script executed:
# Find the implementation file that contains the matching logic
fd "exec" pkg/rulemanager/cel/libraries/applicationprofile/ -not -name "*test*" -type fRepository: k8sstormcenter/node-agent
Length of output: 300
🏁 Script executed:
# Check the implementation to understand wildcard matching behavior
find pkg/rulemanager/cel/libraries/applicationprofile/ -name "*.go" -not -name "*test*" | head -5Repository: k8sstormcenter/node-agent
Length of output: 358
🏁 Script executed:
# Read the exec.go implementation to understand wildcard matching semantics
cat -n pkg/rulemanager/cel/libraries/applicationprofile/exec.goRepository: k8sstormcenter/node-agent
Length of output: 8561
🏁 Script executed:
# Find the dynamicpathdetector implementation
fd "dynamicpathdetector" --type fRepository: k8sstormcenter/node-agent
Length of output: 51
🏁 Script executed:
# Search for CompareExecArgs function
rg "func.*CompareExecArgs" -A 30Repository: k8sstormcenter/node-agent
Length of output: 51
🏁 Script executed:
# Check if there's any documentation or references to wildcard behavior
rg "wildcard|re-anchor|\*" pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 1114
🏁 Script executed:
# Read the full test structure more carefully around the test cases
sed -n '302,419p' pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 4229
🏁 Script executed:
# Search for any documentation about wildcard re-anchoring behavior
rg "re-anchor|anchor" pkg/rulemanager/cel/libraries/applicationprofile/Repository: k8sstormcenter/node-agent
Length of output: 361
🏁 Script executed:
# Look for test comments explaining the matching contract
rg -B5 -A5 "contract|behavior|semantics" pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 51
🏁 Script executed:
# Check if there are any other test files that might cover this pattern
find . -name "*test.go" -path "*/applicationprofile/*" | xargs grep -l "wildcard\|\*"Repository: k8sstormcenter/node-agent
Length of output: 543
🏁 Script executed:
# Check the entire exec_test.go to understand all test cases and verify the structure
wc -l pkg/rulemanager/cel/libraries/applicationprofile/exec_test.go && sed -n '385,410p' pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 1012
🏁 Script executed:
# Verify if there's any evidence of CompareExecArgs implementation elsewhere
rg "CompareExecArgs" --type go -A 50 | head -100Repository: k8sstormcenter/node-agent
Length of output: 8958
🏁 Script executed:
# Look for the dynamicpathdetector package location
find . -name "dynamicpathdetector*" -o -name "*pathdetector*" 2>/dev/null | head -20Repository: k8sstormcenter/node-agent
Length of output: 51
🏁 Script executed:
# Check imports in exec.go to find the full import path for dynamicpathdetector
grep -n "import\|dynamicpathdetector" pkg/rulemanager/cel/libraries/applicationprofile/exec.go | head -20Repository: k8sstormcenter/node-agent
Length of output: 243
🏁 Script executed:
# Search for any references to how wildcard matching should work
rg "WildcardIdentifier|DynamicIdentifier" -B2 -A2 pkg/rulemanager/cel/libraries/applicationprofile/Repository: k8sstormcenter/node-agent
Length of output: 705
🏁 Script executed:
# Verify the exact line numbers mentioned in the review
sed -n '329,350p' pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 623
🏁 Script executed:
# Check the "Also applies to" lines 364-390
sed -n '364,390p' pkg/rulemanager/cel/libraries/applicationprofile/exec_test.goRepository: k8sstormcenter/node-agent
Length of output: 1502
Add an explicit mid-profile * re-anchoring test case.
Current test coverage validates trailing * patterns well, but there's no explicit scenario where * must re-anchor to a later literal token. Adding one positive+negative pair will lock in the anchored behavior contract.
Proposed test additions
@@
Execs: []v1beta1.ExecCalls{
@@
// echo with any number of greeting words after a literal anchor.
{
Path: "/bin/echo",
Args: []string{"hello", "*"},
},
+ // wildcard in the middle must re-anchor to trailing literal.
+ {
+ Path: "/usr/bin/env",
+ Args: []string{"A", "*", "Z"},
+ },
},
})
@@
// echo hello *
{"echo hello world from test", "/bin/echo", []string{"hello", "world", "from", "test"}, true},
{"echo hello (no trailing args)", "/bin/echo", []string{"hello"}, true},
{"echo goodbye world — wrong literal anchor", "/bin/echo", []string{"goodbye", "world"}, false},
+
+ // mid-profile * re-anchoring
+ {"env A x y Z — wildcard re-anchors to trailing literal", "/usr/bin/env", []string{"A", "x", "y", "Z"}, true},
+ {"env A x y — missing trailing literal anchor", "/usr/bin/env", []string{"A", "x", "y"}, false},
}Also applies to: 364-390
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/rulemanager/cel/libraries/applicationprofile/exec_test.go` around lines
329 - 350, Add a positive+negative test pair that exercises mid-profile "*"
re-anchoring in the Execs test table: within the Execs slice (type
v1beta1.ExecCalls) add an entry where Args contains a literal token, then "*"
and then another literal token (e.g., Args: []string{"cmd", "*", "end"}) and
assert a match for a command whose args have arbitrary middle tokens but end
with the trailing literal (positive), and a non-match for a command where the
later literal is missing or not anchored (negative). Update the corresponding
expectations in the test (same table near the existing curl/sh/ls/echo entries
in exec_test.go) so the test validates that "*" can re-anchor to a later literal
token between anchors.
* test(component): port Test_28 to upstream's unified user-overlay label Upstream PR kubescape#788 (Replace AP and NN cache with CP) collapsed the two legacy workload-keyed caches into a single ContainerProfileCache that reads ONE pod label `kubescape.io/user-defined-profile=<name>` and uses <name> as the lookup key for BOTH the user ApplicationProfile and the user NetworkNeighborhood. The fork's earlier two-label scheme (`user-defined-profile` for AP + separate `user-defined-network` for NN, with potentially different names) is no longer honored — the second label is silently ignored. Port: - tests/resources/nginx-user-defined-deployment.yaml: drop the `user-defined-network` label, point the surviving label at one shared name `curl-28-overlay`. - tests/component_test.go Test_28_UserDefinedNetworkNeighborhood: create both AP and NN under that single shared name. Assertions unchanged — the test still verifies that the user NN's egress restriction (only fusioncore.ai allowed on TCP/80) is enforced once the pod is running. Verified locally: go vet -tags=component ./tests/... clean; go test -tags=component -run='^$' ./tests/... compiles cleanly. * feat(cel): re-port CompareExecArgs hookup onto upstream's CP cache PR #35's wildcard-aware exec arg matching needs reapplication on top of the new ContainerProfileCache (upstream kubescape#788) baseline. The original PR sat on the legacy applicationprofilecache, which has been deleted; the call site now reads cp.Spec.Execs from a ContainerProfile. Same semantic change as PR #35: '⋯' (DynamicIdentifier) — matches exactly one argument position. '*' (WildcardIdentifier) — matches zero or more consecutive args. Wiring: - pkg/rulemanager/cel/libraries/applicationprofile/exec.go: drop slices.Compare exact-equality on the cp.Spec.Execs loop; route through dynamicpathdetector.CompareExecArgs. - go.mod: bump fork storage replace to feat/exec-arg-wildcards tip (3fc287210729) which carries the matcher. - exec_test.go: re-add TestExecWithArgsWildcardInProfile (13 subtests across curl --user ⋯, sh -c *, ls -l ⋯, echo hello *, plus negative literal-anchor / under-consumed-⋯ / mid-profile-* cases). Mirrors the test set that lived on PR #35 before the upstream merge. Verified: full applicationprofile package green (`go test ./pkg/rulemanager/cel/libraries/applicationprofile/`). * feat(rules): R0040 'Unexpected process arguments' + Test_32 e2e R0040 is an additive companion to R0001. It evaluates: ap.was_executed(...) && !ap.was_executed_with_args(..., event.args) so it ONLY fires when the exec'd path IS in the user-defined profile (R0001 stays silent) but the runtime arg vector does not match any profile entry's pattern. With wildcard tokens supported by dynamicpathdetector.CompareExecArgs: '⋯' (DynamicIdentifier) — exactly one argument position. '*' (WildcardIdentifier) — zero or more consecutive args. Use case: profile entry {Path: /bin/sh, Args: [sh, -c, *]} flags 'sh -x ...' as drift while permitting 'sh -c <anything>'. Wiring: - tests/chart/templates/node-agent/default-rules.yaml: new R0040 CEL rule definition immediately after R0001, same MITRE tagging (TA0002/T1059) and same applicationprofile-anomaly tag set. - tests/chart/templates/node-agent/default-rule-binding.yaml: R0040 added to the all-rules-all-pods binding next to R0001. - tests/resources/curl-exec-arg-wildcards-deployment.yaml: new fixture, curl pod labelled with the unified kubescape.io/user-defined-profile=curl-32-overlay label. - tests/component_test.go: Test_32_UnexpectedProcessArguments with 4 subtests: 32a sh_dash_c_matches_wildcard_trailing — sh -c <cmd> matches profile [sh, -c, *] — R0040 silent. 32b sh_dash_x_mismatches_R0040 — sh -x <cmd> mismatches the literal -c anchor — R0040 fires. 32c echo_hello_matches_wildcard_trailing — echo hello world matches [echo, hello, *] — R0040 silent. 32d echo_goodbye_mismatches_R0040 — echo goodbye world mismatches the literal hello anchor — R0040 fires. Verified locally: go vet -tags=component ./tests/... clean; go test -tags=component -run='^$' ./tests/... compiles cleanly. End-to-end alert assertions run in CI. * deps(storage): bump to rebased feat/exec-arg-wildcards tip (0de34ebc) After rebasing storage feat/exec-arg-wildcards onto storage main, the matcher branch now sits on top of the latest fork main commit (352395a3 — Internal-field merge fix). Bump the node-agent storage replace to that new pseudo-version so this branch's tests run against storage main + matcher in one consistent baseline. Verified locally: 47/47 non-eBPF unit packages green; vet clean; the applicationprofile CEL package's TestExecWithArgsWildcardInProfile is 13/13 green; component-tests compile under the component tag. The two failing packages (pkg/containerwatcher/v2/tracers and pkg/validator) fail with the same pre-existing /sys/fs/bpf mount-permission error they have on every recent run — env, not code. * ci(component-tests): add Test_32_UnexpectedProcessArguments to matrix The component-tests workflow uses a hardcoded matrix list, not a dynamic discovery from the test source. Test_32 (added in a613cf6) must be listed explicitly to be picked up — without this entry the test is silently skipped. * fix(containerprofilecache): re-wire R1016 tamper alert + expand Test_31 Upstream PR kubescape#788 (Replace AP and NN cache with CP) deleted the legacy applicationprofilecache where the fork's emitTamperAlert (commit c2d681e 'Feat/tamperalert' #22) lived. After the merge, R1016 alerts no longer fired for tampered user-defined profiles, breaking Test_31_TamperDetectionAlert (passed 3/3 on main, fails on the merged branch — confirmed regression introduced by PR #36). This restores the contract: every cache load of a user-supplied ApplicationProfile or NetworkNeighborhood overlay re-verifies the signature, and emits an R1016 'Signed profile tampered' alert through the rule-alert exporter when the signature is present but no longer valid. Alert shape preserved from the legacy cache so dashboards and component tests keep matching. Implementation: - new file pkg/objectcache/containerprofilecache/tamper_alert.go: verifyUserApplicationProfile / verifyUserNetworkNeighborhood / emitTamperAlert / extractWlidFromContainerID. Self-contained; keeps containerprofilecache.go diff small. - containerprofilecache.go: new tamperAlertExporter field + SetTamperAlertExporter setter + verify hooks immediately after GetApplicationProfile / GetNetworkNeighborhood succeed in the user-overlay branch of addContainer. - cmd/main.go: wire the alert exporter via SetTamperAlertExporter after the cache constructor (kept the constructor signature unchanged to avoid blast radius on tests). The setter is nil-safe: when no exporter is wired, verification still runs and is logged but no alert is emitted — matches the legacy behavior for unit-tests-with-no-exporter. Test_31 expanded from one scenario to four subtests, each in its own namespace to avoid alert cross-contamination: 31a tampered_user_defined_AP_fires_R1016 — original regression case 31b untampered_signed_AP_no_R1016 — negative: clean signature 31c unsigned_AP_no_R1016 — signing is opt-in 31d tampered_user_defined_NN_fires_R1016 — parallel NN code path Verified locally: - go build ./... clean - go test ./pkg/objectcache/containerprofilecache/... green - go test ./pkg/signature/... green - go vet -tags=component ./tests/... clean - go test -tags=component -run='^$' ./tests/... compiles * test(component): Test_32 profile uses full-path argv[0] Empirical finding from CI run 25178930763 — Test_32's positive subtests (32a sh_dash_c_matches, 32c echo_hello_matches) fired R0040 when they should not. Cause: at runtime, the eBPF tracer captures argv[0] as the FULL exec path (e.g. "/bin/sh") rather than the basename ("sh"). My profile entries used basenames, so the matcher's first-position literal compare missed and the cache fell through to 'no exec entry matches' — R0040 fires. Aligns Test_32's profile with the convention already used by Test_27's wildcard_yaml_profile_allowed_opens fixture (known-application-profile-wildcards.yaml predecessor): argv[0] is the full path, subsequent positions are flags/values. Subtest expectations after this fix: 32a sh -c <cmd> → matches [/bin/sh, -c, *] → R0040 silent 32b sh -x <cmd> → -c anchor mismatch → R0040 fires 32c echo hello <…> → matches [/bin/echo, hello, *]→ R0040 silent 32d echo goodbye <…> → hello anchor mismatch → R0040 fires * test: AP-fixture linter (R-AP-* rules) + canonical reference profile Catches the class of bug Test_32 hit on its first CI run (PR #37 run 25178930763): profile entries used basename argv[0] ("sh") while the eBPF tracer captures the full path ("/bin/sh"), so the matcher silently misses and the rule fires when it shouldn't. Without a linter, this kind of fixture drift only surfaces in a 15-minute component-test run on a kind cluster — too late, too expensive. The linter (LintApplicationProfile / LintApplicationProfileYAML in tests/resources/aplint_test.go) is intentionally written as a pure function returning []Violation. Zero testing-package coupling on the hot path so it can be lifted into a future bobctl subcommand `bobctl lint <ap.yaml>` without rewrite — see backlog at ~/biz/sbob-business-plan/state.yaml. Rules: R-AP-01 — kind must be ApplicationProfile R-AP-02 — at least one container R-AP-03 — container.name non-empty R-AP-10 — exec.path absolute (catches relative paths) R-AP-11 — exec.path no wildcards (binary identity is exact) R-AP-12 — exec.args[0] equals exec.path or wildcard token (Test_32-style argv[0] basename trap) R-AP-13 — exec.args wildcard tokens are whole-word (no embedding) R-AP-20 — open.path non-empty + absolute R-AP-21 — open.flags non-empty (real auto-recorded opens always have ≥1) R-AP-22 — open.flags from known O_* set (catches typos) Each rule has a dedicated self-test that constructs a minimal-bad YAML and asserts the rule fires (5 negative tests). One positive test (TestLinter_canonical_AP_passes) parses the fork's reference known-application-profile.yaml — extracted from a real auto-recorded AP for curlimages/curl:8.5.0 in fea3b06 — and asserts zero violations. The reference YAML is restored to tests/resources/ so the canonical shape is in-tree and visible to humans + CI. Why a Go test rather than a shell linter: keeps the rule set in the same language as the storage matcher (`dynamicpathdetector`), so extending CompareExecArgs and the linter together stays cheap. Local-cluster organic learning was the original plan but k3s on OrbStack is currently flapping (LXC-related boot loop). The fea3b06 profile was extracted from real auto-learning at an earlier moment of stability, which is the next-best ground truth. * fix(tamper_alert): accept self-signed profiles, only flag actual tamper Switch verifyUser{ApplicationProfile,NetworkNeighborhood} from strict VerifyObject to VerifyObjectAllowUntrusted. The strict variant requires a Sigstore Fulcio trust chain and rejects locally-signed profiles even when the signature against the embedded cert is valid. That made Test_31b 'untampered_signed_AP_no_R1016' fire R1016 against an untampered AP, and broke Test_30's 'tampered_profile_loaded_without_ enforcement' subtest the same way. The intent is: tamper detection, not trust-chain enforcement. Matches cmd/sign-object/main.go's default verifier. * test(component): make Test_30 30b deterministic by re-execing inside Eventually The single-shot wget exec before Eventually was racy: if the eBPF event landed before the CP cache projected the user-defined AP, the rule manager evaluated against an empty baseline and R0001 never fired within the 60s polling window. Same race Test_29 already documents. Drive the wget exec inside the Eventually loop (10s tick, 120s deadline) so cache-load latency is absorbed by retries. Filter R0001 to comm=wget to make the assertion specific instead of catching any R0001. Drops the blind 15s pre-sleep and the redundant settle-then-recount block. * deps(storage): bump replace to f44fed80 (analyzer trailing-* fix) Picks up the upstream-PR-kubescape#316 review fix: trailing WildcardIdentifier now requires at least one regular-path segment, matching standard glob semantics. Closes the R0002 blind spot where '/etc/*' would silently match the bare '/etc' directory. * deps(storage): bump replace to 4ab95fb8 (PR #25 merged on fork main) Pulls in the full PR-kubescape#316 review fix set that just landed on storage main: proper splitPath-based trailing-* anchoring, DefaultCollapseConfigs() defensive-copy accessor, FindConfigForPath value-return, splitEndpoint defensive guard, plus the BenchmarkCompareDynamic baseline. * test(component): Test_33_AnalyzeOpensWildcardAnchoring End-to-end pin of the storage-side CompareDynamic contract through R0002. Each subtest deploys a fresh nginx pod with a user-defined AP carrying ONE Opens entry, then `cat`s a target path that probes a boundary case from the storage analyzer fixes (kubescape/storage kubescape#316 review by matthyx + entlein): - Anchored trailing `*` matches one OR MORE remaining segments — never zero. So /etc/* matches /etc/passwd but NOT bare /etc. - DynamicIdentifier (⋯) consumes EXACTLY ONE segment. - Mid-path `*` is zero-or-more, so /etc/*/* matches /etc/ssh (inner * consumes zero, trailing * consumes one). - Mixed ⋯/* combinations: ⋯ pins one, * consumes the rest. - splitPath normalises trailing slashes on both sides. 11 subtests covering: trailing_star_matches_immediate_child — basic /etc/* match trailing_star_matches_deep_child — multi-segment under prefix trailing_star_does_not_match_bare_parent — the security fix deep_prefix_trailing_star_does_not_match_parent — same rule, deeper ellipsis_pin_one_segment_then_literal — ⋯ rejects zero ellipsis_then_trailing_star_matches_two_* — ⋯/* combo, 2 levels ellipsis_then_trailing_star_matches_three_* — ⋯/* combo, 3 levels double_trailing_matches_one_child — /*/* mid-zero double_trailing_matches_deep_child — /*/* mid-one double_trailing_does_not_match_parent — /*/* needs ≥1 child trailing_slash_in_profile_normalises_to_literal — splitPath on profile Pinned at component level on TOP of the unit suite in storage/pkg/registry/file/dynamicpathdetector/tests/coverage_test.go. Both layers must agree — a drift in either lights up R0002 with a false positive or false negative. Matrix entry added to component-tests.yaml so the test runs in CI. * test(component): rework Test_33 negative cases to probe under R0002's monitored prefix R0002's CEL ruleExpression has a strict path-prefix filter: event.path.startsWith('/etc/') || event.path.startsWith('/var/log/') || ... All with trailing slash. Bare /etc and /var/log don't pass the filter, so R0002 never evaluates on those events — the matcher's bare-parent anchoring contract stays invisible at runtime even though the storage unit tests pin it. Probe one level deeper instead: /etc/ssl IS under the /etc/ monitored prefix, so the rule CAN see whether a /etc/ssl/* profile entry matches the bare /etc/ssl parent. Same security guarantee, observable layer. Reworked subtests: - trailing_star_does_not_match_bare_parent_under_monitored_prefix: profile /etc/ssl/*, cat /etc/ssl → R0002 fires - deep_prefix_trailing_star_does_not_match_parent: profile /etc/ssl/certs/*, cat /etc/ssl/certs → R0002 fires - ellipsis_requires_one_segment_not_zero: profile /etc/passwd/⋯, cat /etc/passwd → ⋯ requires one more segment - double_trailing_does_not_match_parent_under_monitored_prefix: profile /etc/ssl/*/*, cat /etc/ssl → R0002 fires The 7 positive subtests that already passed are untouched. Added a comment block documenting why we probe at /etc/ssl rather than /etc. * test(component): fix Test_28 + Test_31 31b flakiness Two distinct fixes for what looked like the same intermittent failures across PR #37 runs: Test_31 31b 'untampered_signed_AP_no_R1016' — root cause: storage's PreSave runs DeflateSortString on Syscalls (and Capabilities, Architectures), which sorts + dedupes. The signSignedAP helper signed the AP BEFORE pushing, against unsorted syscalls {socket, connect, read, write, close, openat}. After PreSave the stored AP had sorted {close, connect, openat, read, socket, write}, so the content hash differed from the signature → server-side verify correctly failed → R1016 fired even though the profile was untampered. Test_29 + Test_30 30b had the same fixture but didn't observe the bug because they only assert R0001 counts, never R1016. Pre-sort the syscalls in all three test fixtures so storage's normalization is a no-op on round-trip. Test_28 28a 'allowed_fusioncore_no_alert' — root cause: 15s post-deploy sleep wasn't always enough for the upstream ContainerProfileCache to project the user-defined NN. Failure mode is alert payload `profileMetadata.errorMessage:"waiting for profile update"` — the rule manager evaluated against an unloaded NN and fired R0005/R0011 spuriously. Bumped to 30s with a comment documenting why. A real fix would poll a cache-loaded signal but no such signal is exposed from outside the node-agent today. * test(component): sign-after-roundtrip in Test_31 to defeat content-drift R1016 false positive Test_31 31b 'untampered_signed_AP_no_R1016' kept flaking because the AP's content hash drifted between client-side sign and server-side verify across the K8s/storage roundtrip. Sources of drift include storage's PreSave normalisation (DeflateSortString, DeflateStringer, DeflateRulePolicies), signature/profiles GetContent's nil→empty-map mutation on PolicyByRuleId, and any K8s server-side defaulting of spec/metadata fields. Pre-sorting Syscalls in the previous fix only covered one of these. Sign-after-roundtrip closes the whole class: 1. Push the AP UNSIGNED to storage. PreSave runs, normalises content. 2. Read it back — this is what node-agent will see at verify time. 3. Sign THAT normalised content. 4. Delete the unsigned in-storage copy so deployAndWait can Create the signed version without an AlreadyExists conflict. 5. Strip server-managed metadata (resourceVersion / uid / etc.) from the returned AP so the second Create succeeds cleanly. Second push goes through deflate again. Idempotent on already-normalised content → stored bytes identical to signed bytes → content hash matches → verify succeeds → no R1016 false positive. Tampered subtests (31a, 31d) keep working: signSignedAP returns a known-good signed AP, the test mutates it post-helper, deployAndWait Creates the mutated version, storage round-trip preserves the mutation, and verify correctly detects the divergence. * test(component): bump Test_33 WaitForReady to 180s for cluster-pressure tolerance Test_33 deploys 11 fresh pods sequentially, one per subtest. Later subtests race against an increasingly loaded kind cluster — CP cache reconciler, alertmanager, prometheus all chew CPU at boot. 80s WaitForReady deadline timed out on the post-23ea224 run with 'workload not ready in ns ...' for early subtests once the cluster got busy. 180s gives headroom without changing total runtime regime. * deps(storage): bump replace to 43795bb4 (storage feat/exec-arg-wildcards tip — CRDs + CompareExecArgs) * test(aplint): drop redundant p := p loop var (Go 1.22+, copyloopvar lint) CodeRabbit (PR #38). Loop variable shadowing is no longer required since Go 1.22 made each iteration capture its own variable. The shadow trips golangci-lint's copyloopvar check. * fix(tamper_alert): R1016 dedup + use real WLID CodeRabbit (PR #38) flagged two issues on the refresh-loop tamper path: 1. Dedup R1016 (Major). The cache refresh re-runs verifyUser*() on every reconcile cycle, so a long-lived tampered profile would emit R1016 on every cycle and once per container reference. Track emitted alerts in a sync.Map keyed on (kind|ns/name@resourceVersion). LoadOrStore gives atomic 'first transition to invalid' semantics; a re-tamper at a new RV uses a fresh key and alerts again. Verification passing clears the key so future tampers re-alert. 2. Pass real WLID (Major). emitTamperAlert previously called extractWlidFromContainerID(containerID) — but containerID is a runtime ID like 'containerd://...' which GenericRuleFailure.SetWorkloadDetails parses as a WLID and silently drops kind/name/cluster attribution. Thread sharedData.Wlid (already in scope at the call site in containerprofilecache.go) through verifyUser*() into emitTamperAlert. Drop extractWlidFromContainerID — no longer needed. Test_31_TamperDetectionAlert exercises this path end-to-end; expecting it to keep passing (one alert per tampered profile, with proper workload attribution in Alertmanager). * deps(storage): bump replace to b0d68d3d (empty-Args wildcard match) Picks up storage's CompareExecArgs fix for path-only Execs entries (CodeRabbit PR #38, finding #3). With this bump, R0040 no longer fires on every legit exec of a path that the user-defined profile allowed but didn't constrain by argv. * fix: address CodeRabbit second-review batch on PR #38 Three CR concerns addressed in one commit (all submitted 2026-05-04 10:09): 1. tamper_alert.go (Major) — only emit R1016 on actual signature mismatch. Previously every error from VerifyObjectAllowUntrusted (hash computation, verifier construction, malformed annotations, signature mismatch) was treated as a tamper event. With EnableSignatureVerification=true that meant a transient operational failure could drop a valid overlay AND emit a false R1016. Fix: - Add signature.ErrSignatureMismatch sentinel in pkg/signature/annotations.go - Wrap the signature-fail return in verify.go with the sentinel (Go 1.20+ multiple-%w form) - Classify in tamper_alert.go via errors.Is(err, ErrSignatureMismatch); operational errors log with a 'NOT tamper' tag, do NOT touch the tamperEmitted dedup map, and do NOT emit R1016. They still honour strict-mode (return !EnableSignatureVerification) so behaviour is conservative. Adds tamper_alert_test.go: pins LoadOrStore semantics, confirms each operational-error variant returns false on errors.Is(ErrSignatureMismatch), smoke-tests the sentinel value. 2. aplint_test.go:221 (Minor) — replace strings.Contains substring check for fixture-type detection with proper YAML-header parse. The substring form silently skipped fixtures with quoted/non-canonical 'kind' values AND would false-positive on nested OwnerReferences with the same substring. Now uses sigs.k8s.io/yaml (already imported) to unmarshal into a {Kind string} struct and branch on the parsed value. 3. aplint_test.go R-AP-12 (Major, 'Heavy lift') — REJECTED with reasoning to be posted as CR reply. Evidence shows R-AP-12 enforces the actual runtime contract: - pkg/rulemanager/cel/libraries/parse/parse.go's getExecPath returns args[0] (the full binary path) - ap.was_executed_with_args(containerID, parse.get_exec_path(event.args, event.comm), event.args) passes the FULL argv to the matcher - Integration test pkg/rulemanager/cel/libraries/applicationprofile/ integration_test.go uses ["/bin/bash", "-c", ...] — full argv - Real fixture tests/resources/known-application-profile.yaml uses args: ["/bin/sleep", "infinity"] with args[0] == path The compare_exec_args_test.go cases that look like flags-only ([-s, ⋯]) are matcher-isolation unit tests, not the wired contract. Removing R-AP-12 would let users author flags-only profiles that silently fail to silence R0040 in production. Bumps storage replace to a7e6234349ab (storage main HEAD post-PR #27). * fix: address CodeRabbit third-review batch on PR #38 (0cf4a50) Three new findings on the May 9 review batch: 1. aplint_test.go R-AP-13 (Major) — embedded * was not flagged, only embedded ⋯. An arg like "foo*bar" linted clean, defeating the purpose of the wildcard-must-be-standalone rule. Added a parallel check for embedded wildcardIdentifier mirroring the existing dynamicIdentifier check. 2. aplint_test.go canonical-fixture skip path (Major) — t.Skipf on missing reference fixture silently disabled the gold-standard test if someone deleted/renamed it. Switched to t.Fatalf with a message that explains the file's role. Also switched the YAML-parse-failure path from Skipf to Fatalf — un-parseable YAML in tests/resources/ is a fixture-quality bug, not something to skip past. Kept Skipf only for the kind!=AP path, which correctly handles the legit non-AP fixtures (Deployments, Services etc.) that share the directory. 3. tamper_alert_test.go (Trivial nitpick, 'Low value') — added an integration-style test that exercises the full verifyUserApplicationProfile path: real signature.SignObjectDisableKeyless, tamper the spec, call verifyUserApplicationProfile, assert tamperEmitted carries the key. Then re-sign at a new RV and assert the dedup is cleared. Confirms the wiring 'verifier returns ErrSignatureMismatch → classify as tamper → LoadOrStore in dedup map' actually fires, not just the LoadOrStore call in isolation. Uses the real cosign adapter — no mocking needed. * fix(test): make dedup-clearing assertion non-trivial CodeRabbit nitpick on PR #38 (tamper_alert_test.go:159): the prior version of TestVerifyAP_TamperedProfile_PopulatesDedupMap bumped the profile's ResourceVersion to '43' before the re-sign, then asserted that the key for RV='43' was absent from tamperEmitted. But '43' was never added in the first place — only '42' was added during the tamper phase. The assertion was trivially true. Drop the RV bump: re-sign at the SAME RV='42' so that verifyUserApplicationProfile takes the verify-clean branch and Deletes the existing dedup entry. The assertion now actually exercises the clearing path. Production code unchanged. * fix(ci): drop 'permissions: read-all' from reusable workflows Chore PR #39 added 'permissions: read-all' to every workflow file including the workflow_call reusables. That broke the reusable-call contract: a called workflow's top-level permissions are the FLOOR the caller must grant. Callers of these reusables (pr-created.yaml's pr-created job, pr-merged.yaml's pr-merged job, etc.) only grant a narrow set of scopes (id-token:write, packages:write, security-events: write, pull-requests:write, contents). They do NOT grant the full read-all set (actions:read, artifact-metadata:read, attestations:read, checks:read, deployments:read, discussions:read, issues:read, models:read, vulnerability-alerts:read, packages:read, pages:read, repository-projects:read, statuses:read, id-token:read). Result: every PR opened on this fork since chore #39 merged (May 6) has had pr-created.yaml startup_failure with: Error calling workflow ... incluster-comp-pr-created.yaml@<sha>. The workflow is requesting 'actions: read, ...', but is only allowed [...] Fix: strip top-level 'permissions: read-all' from the four reusables: - benchmark.yaml (also direct-trigger; falls back to per-job perms) - go-basic-tests.yaml - incluster-comp-pr-created.yaml - incluster-comp-pr-merged.yaml Each reusable's per-job 'permissions:' blocks remain intact and correctly request only what the job needs. Callers' existing grants are sufficient. Top-level 'permissions: read-all' stays on the OUTERMOST workflows (pr-created.yaml, pr-merged.yaml, build.yaml, bypass.yaml, component-tests.yaml, sign-object.yaml, scorecard.yml) — they're event-triggered, not workflow_call'd, so the read-all default still hardens GITHUB_TOKEN there. --------- Co-authored-by: Entlein <eineintlein@gmail.com>
Summary
Routes
ap.was_executed_with_args(containerID, path, args)through the newdynamicpathdetector.CompareExecArgsmatcher (k8sstormcenter/storage#23) souser-defined ApplicationProfile entries can express argument-vector patterns:
⋯DynamicIdentifier*WildcardIdentifierAnything else is literal-equality. Both ends are anchored — runtime args must
be fully consumed by the profile vector (literal /
⋯/*absorbing run).Previously the function used
slices.Comparefor exact equality, so a profileentry like
Args: [\"--user\", \"⋯\"]would never match runtime args[\"--user\", \"alice\"]— wildcard tokens were silently treated as literals.Why
Sibling work to the storage wildcard-port endpoint fix (#33). User-defined
profiles already store path + args; the path side has had wildcard support for a
while, the arg side did not.
Tests
TestExecWithArgsWildcardInProfile— 13 subtests covering:⋯value)sh -cwith arbitrary payload (*)ls -lwith any directory (⋯requires exactly one)echo hello *(zero or more trailing words)⋯over- and under-consumed,*mid-profile that fails to re-anchor on a literal.TestExecWithArgsInProfilestill passes — exact-match behaviorpreserved when no wildcards are present (literal vectors are a strict
subset of the wildcard grammar).
pkg/rulemanager/...andpkg/storage/...suites green; vet clean.Out of scope
tests/chart/templates/node-agent/default-rules.yamlrule R0001 ("Unexpectedprocess launched") currently uses path-only
ap.was_executed. Exercising thewildcard arg path end-to-end requires either updating R0001 to also call
ap.was_executed_with_argsor shipping a separate "Unexpected processarguments" rule. Both are larger semantic decisions — kept out of this PR so
the matcher wiring lands cleanly first. Component-test for the wildcard path
will follow once the rule shape is decided.
Test plan